Valency Lexicon of Czech Verbs: Towards Formal Description of Valency and Its Modeling in an Electronic Language Resource
Abstract
Valency refers to the capacity of a verb (or a word belonging to another part of speech) to take a specific number and type of syntactically dependent language units. Valency information is thus tied to particular lexemes, so the valency characteristics of individual lexemes must be described in the form of lexicon entries. A valency lexicon is indispensable for any complex Natural Language Processing application based on the explicit description of language phenomena. At the same time, such lexicons are necessary for building language resources that provide the basis for tools using machine learning techniques. The present habilitation work consists of a collection of previously published scientific papers. It summarizes the results of building a lexical database of Czech verbs and concentrates on three essential topics. The first is the formal representation of valency properties of Czech verbs in the valency lexicon, including the logical organization of the richly structured lexicon data. The second concerns new theoretical issues that result from the extensive processing of language material, namely the concept of quasi-valency complementation and the adequate processing of verb alternations. The third addresses questions of formal modeling of a natural language: a new formal model of dependency syntax, based on a novel concept of restarting automata, is introduced. The main applied product of the work is the publicly available Valency Lexicon of Czech Verbs (VALLEX), a large-scale, high-quality lexicon that contains semantic and valency characteristics for the most frequent Czech verbs. VALLEX has been designed with emphasis on both human and machine readability, so both linguists and developers of Natural Language Processing applications can use it.
Related resources
Reflexive Verbs in a Valency Lexicon: The Case of Czech Reflexive Morphemes
In this paper, we deal with Czech reflexive verbs from a lexicographic point of view. We show that the Czech reflexive morphemes se and si carry different linguistic meanings: either they are formal means of the word-formation process of so-called reflexivization, or they are associated with the syntactic phenomena of reflexivity, reciprocity, and diatheses. All of these processes ar...
Valency Frames Of Czech Verbs In VALLEX 1.0
The Valency Lexicon of Czech Verbs, Version 1.0 (VALLEX 1.0) is a collection of linguistically annotated data and documentation, resulting from an attempt at a formal description of valency frames of Czech verbs. VALLEX 1.0 is closely related to the Prague Dependency Treebank. In this paper, the context in which VALLEX came into existence is briefly outlined, and also three similar projects for Engli...
Valency Lexicon of Czech Verbs
Valency is a property of language units reflecting their combinatorial potential in language utterances. The availability of valency information is assumed to be crucial in various Natural Language Processing tasks. In general, the valency of language units cannot be predicted automatically, and therefore it has to be stored in a lexicon. The primary goal of the presented work is to crea...
Valency Lexicon for Czech: From Verbs to Nouns
The valency lexicon of Czech verbs has been intensively worked on for more than a year, and we now have at our disposal a detailed description of the valency frames of several hundred verbs. The challenge that naturally arises is to use the existing lexicon for capturing the valency of other word classes. In this paper, we focus on the valency of nouns derived from verbs. We propose an algorithm for autom...
The Representation of Diatheses in the Valency Lexicon of Czech Verbs
In the present paper, we deal with diatheses in Czech from a lexicographic point of view. We propose a method for describing them in the valency lexicon of Czech verbs, VALLEX. We distinguish grammatical and semantic diatheses as two typologically different changes in verbal valency structure. In the case of grammatical diatheses, these changes are regular enough to be described by formal syntactic...